THE COURTIS TESTS IN ARITHMETIC.
By Philip A. Boyer. 

The efficiency of an operation is determined in large measure 
by the ability to evaluate results. In no department of educa- 
tional work are there greater opportunities for service to the 
individual pupil than in the scientific measurement of the 
product. The rapid multiplication of tests and measurements 
in the field of education is at once an indication of their usefulness
and a guarantee of their success in directing the searchlight
of scrutiny to the minutiae of the teaching process. Indeed,
before one gets very far with such investigation he finds
himself launched upon numerous related and more detailed 
studies, the sum total of which cannot fail to bring about a 
more intelligently purposeful contact between teacher and pupil. 

Teachers test pupils religiously and even more devoutly do 
they detest the scoring of papers. This is due entirely to the 
barrenness of the task. Give to the test a scientifically de- 
termined purpose; tie it up intimately with specific pedagogic 
method; let it show to both teacher and pupil the precise degree
of success attained in a very definite detail of subject matter; 
let this result be exactly comparable, not only with like scores 
from many other classes working under similar conditions, but 
also with the demands of society for certain and sure command 
of particular definitely useful details of knowledge; and the 
teacher is stolid indeed who does not thrill at the opportunity 
for real service. Testing of this kind furnishes a basis for the 
diagnosis of class and individual needs and suggests the im- 
mediate application of specific devices for improvement. 

The general field of mathematics furnishes abundant material 
especially well adapted to purposeful testing, and the funda- 
mental operations in arithmetic are peculiarly fitted to such 
treatment because they comprise a tool subject of definitely 
limited content, a subject which every child must know com- 
pletely, use with automatic precision and in which rate is as 
122 THE MATHEMATICS TEACHER.

essential as accuracy. From the many standard tests and scales 
in the field of arithmetic the consensus of opinion of educators
points to the selection of a general test of achievement in the 
fundamentals as the type of test best suited for preliminary 
survey. The results of such tests can be secured and tabulated 
with comparative ease and may be referred to scientifically de- 
termined and generally accepted standards. Such results will 
indicate many possibilities for further investigation and experi- 
ment and it is in this sort of work that the more elaborate and 
detailed tests and scales will be found useful. 

The Courtis Standard Research Tests in Arithmetic Series B 
were chosen as best fitting the above conditions and were given 
in seven Philadelphia Schools in March, 1918, with the following
five purposes in view:

1. To establish a criterion of judgment as to the comparative 
success of the present teaching of the fundamentals. 

2. To find a base from which to measure future progress. 

3. To provide for the development of definite, detailed and 
objective aims of instruction and drill in each grade, i.e., to
establish reasonable standards of achievement. 

4. To furnish means for a diagnosis of the teaching of the 
fundamental operations in order to develop methods whereby 
the efficiency of instruction might be increased. 

5. To bring to the attention of teachers the striking individual 
differences existing among pupils of any given class and to 
attempt to fit instruction and drill to the varying needs of such 
pupils. 

Because of the very limited scope of the tests the attention of 
teachers could not fail to be drawn to the most minute elements 
of success and failure. The results were tabulated with en- 
thusiasm and even before comparisons could be made there 
were attempts on every hand to relate achievements to specific 
processes of teaching and drill. In the very natural search for 
explanations both the teaching method and the elements of the 
operations themselves were carefully analyzed; a spirit of scientific
inquiry was instituted.

Comparisons of results were then made between classes, 
schools, and with the results obtained in other cities, as well as 
with the Courtis general medians and the Courtis standards. 
Such comparisons are indeed interesting, but they must be made,
as Mr. Courtis insists, with extreme caution. He says: "One
should be careful to recognize that a score in a given test repre- 
sents merely a performance under the given conditions. Every 
one should take pains to give and score the tests under stan- 
dard conditions, but at best should expect to get from city- 
to-city comparisons only conclusions as to the nature and 
amount of relative progress and not judgements as to absolute 
achievements." The conditions mentioned include such things as 
time allowance, physical conditions, temperature, humidity and 
lighting. It should, therefore, be noted here that the tests were 
given to the group under consideration on dark, dismal and 
spiritless days. They were entirely new and strange to the 
great majority of pupils, and there must therefore have been at 
least a slight degree of tacit misunderstanding on the part of 
some pupils even with the most careful adherence on the part 
of the examiners to the standard directions of Mr. Courtis. 
Again, in Test No. 4 (division), the examples were arranged in 
a way wholly foreign to Philadelphia pupils. Even though the 
pupils were instructed to place the quotient to the right of the 
dividend, as was their custom, the lack of sufficient space for 
such quotient and the disconcerting vinculum above the dividend 
must have been disturbing factors. It should also be remem- 
bered that the scores here presented are March scores and as 
such are more nearly mid-year scores than the May or June 
scores with which they are compared. This condition alone 
would warrant us in expecting the scores to be lower than 
Standard June scores by at least one third of the grade interval 
in each case. These variant conditions will have to be kept in 
mind when comparisons are made with future results in the 
same schools and also when we draw comparisons with the 
results attained in other cities. 
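
The one-third allowance can be made concrete with a little arithmetic. The sketch below assumes that the "grade interval" means the difference between the June medians of two consecutive grades, and that a March score may be expected to trail the June figure by one third of that interval. Both this reading of the interval and the function name are assumptions made for illustration. The figures are the Courtis general medians for rate in addition, as read from Table I.

```python
# Courtis general medians for rate in addition, grades IV to VIII (Table I).
COURTIS_ADDITION_RATE = {4: 7.4, 5: 8.6, 6: 9.8, 7: 10.9, 8: 11.6}


def expected_march_score(june_medians, grade, fraction=1 / 3):
    """Estimate a March score by stepping back part of a grade interval.

    Assumption: the grade interval is the June median of `grade` minus
    the June median of the grade just below it.
    """
    interval = june_medians[grade] - june_medians[grade - 1]
    return june_medians[grade] - fraction * interval


for grade in range(5, 9):
    estimate = expected_march_score(COURTIS_ADDITION_RATE, grade)
    print(f"Grade {grade}: June median {COURTIS_ADDITION_RATE[grade]}, "
          f"expected March score about {estimate:.1f}")
```

On this reading, a fifth-grade class would be expected to stand at roughly 8.2 examples in March against a June median of 8.6, so only a small part of the Philadelphia deficit can be laid to the date of testing.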

Class, grade and school medians have been tabulated for pur- 
poses of internal comparison, but only the median scores of the 
total pupilage of the seven schools tested will be presented here 
as of general interest. These are given in Table I, and will be 
herein referred to as the Philadelphia March medians. The 
relation of these medians to the Courtis General medians may 
be studied here, though for purposes of general comparison the 
relations are shown more clearly in Graphs I and II. In Graph
I, representing rate of work, the Philadelphia District medians
are indicated by light full lines; the Courtis medians are shown

TABLE I.
Median Scores. Courtis Standard Tests. Arithmetic Series B.

                              Addition.         Subtraction.      Multiplication.   Division.
Grade.                        Rate.* Accuracy.* Rate.  Accuracy.  Rate.  Accuracy.  Rate.  Accuracy.

IV.   Phila. March Scores     4.4    57         5.2    59         4.4    57         3.0    39
      Prov. June Standards    6      64         6      80         6      67         4      57
      Phila. June Scores      5.8    64         6.8    82         6.0    69         4.3    67
      Courtis Gen. Medians    7.4    64         7.4    80         6.2    67         4.6    57

V.    Phila. March Scores     5.4    64         6.5    77         6.3    72         4.2    59
      Prov. June Standards    7      70         8      83         7      75         5.5    77
      Phila. June Scores      7.8    70         8.2    84         7.6    78         6      82
      Courtis Gen. Medians    8.6    70         9      83         7.5    85         6.1    77

VI.   Phila. March Scores     6.9    72         7.8    82         7.6    78         5.2    77
      Prov. June Standards    8.5    73         9.5    85         8.5    78         7      87
      Phila. June Scores      8.9    74         9.5    89         9.5    84         8.3    88
      Courtis Gen. Medians    9.8    73         10.3   85         9.1    78         8.2    87

VII.  Phila. March Scores     7.8    71         9.4    84         8.6    82         6.8    84
      Prov. June Standards    9.5    75         11     86         9.5    80         8.5    90
      Phila. June Scores      10.3   78         11.4   90         10.6   85         10.3   93
      Courtis Gen. Medians    10.9   75         11.6   86         10.2   80         9.6    90

VIII. Phila. March Scores     8.7    77         10.6   88         9.7    85         8.1    91
      Prov. June Standards    10.5   76         12     87         10.5   81         9.5    91
      Phila. June Scores      11     78         12.1   92         12     88         11.6   99
      Courtis Gen. Medians    11.6   76         12.9   87         11.5   81         10.7   91

* Rate is here indicated in number of examples completed; accuracy in
per cents.

by the broken lines. It will be seen at a glance that the Phila- 
delphia scores show a progress almost exactly parallel with that 
indicated by the Courtis medians, but this is rather cold comfort 
when it is noted that Philadelphia medians run three attempts 
less than the Courtis medians in addition, two attempts less in 
subtraction and division, and more than one less in multiplica- 
tion. The Courtis tests do not explain situations, do not diag- 
nose ; they merely state facts, the interpretation of which often 
calls for extended investigation, and it was with this sort of in- 
vestigation that the Philadelphia committee proposed to occupy 
itself. The Philadelphia rate of work must be increased, not 
through hurry or "speed," but rather through careful revision
of teaching method to bring about automatization of reactions. 

In Graph II, which represents accuracy medians, we find a 
closer approximation of the Philadelphia medians to those of 
Courtis but only in the seventh grade in multiplication and in 
the eighth grade in all operations do they exceed those medians. 
It will be noted in this graph in all the operations and especially 
in subtraction that we do not find the same parallelism of the 
Philadelphia and Courtis results, as was the case in rate. The 
greater difference in grade four would seem to indicate that 
we are not developing proficiency in the fundamentals sufficiently 
early in the grades. In view of the necessity for constant use 
of these processes in the arithmetic work of the upper grades, 
the question may be raised whether it would not be more efficient 
to develop the fundamentals early. This, again, was a problem 
for the committee to study in detail. It may be found that the 
great progress from grade four to grade five ought to be moved 
down to between grades three and four, or rather that we should 
bring about in grade three what is now achieved in grade four. 

Probably the most important result of the comparisons of the 
Philadelphia and Courtis results came from the establishment 
of definite and detailed aims for the work of each grade. 
These aims were set approximately midway between March 
achievements and the Courtis medians, for it did not seem 
reasonable to expect pupils to attain Courtis median proficiency 
in a period of three months. Presumably attainable goals were 
accordingly set up for each grade in each operation. These 
goals were to be reached in June when another test was 
promised.* The procedure here outlined is somewhat arbi- 
trary to be sure but it established working aims which furnished 
compelling incentive to persistent and constructive endeavor on 
the part of pupils and teachers alike. Instead of looking forward
with dread to the possibility of failing to make an "average"
of 70 in some loosely determined and vague "examination,"
fourth-grade children, for example, knew that they would
be expected to work in eight minutes six addition examples of
a given type with sixty per cent accuracy. Each child knew
where he stood at the time and what he must work for, and 

* See Provisional June Standards — Table I. 
teachers were only too willing to assist in the process. From
week to week the progress of the class toward these definite
goals was shown graphically and individual pupils were encouraged
to graph their own progress. The finest kind of
rivalry, that with one's own past achievements, was set up for
each class and each individual.
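
The rule for setting the provisional standards can be checked by simple arithmetic. The sketch below takes the rate medians in addition as read from Table I, computes the point midway between each March median and the corresponding Courtis general median, and prints it beside the provisional June standard. The function name is an illustrative assumption. The standards appear to be these midpoints taken to a convenient whole or half example, which agrees with the word "approximately" used above.

```python
# Rate medians in addition for grades IV to VIII, as read from Table I.
GRADES = ["IV", "V", "VI", "VII", "VIII"]
MARCH = [4.4, 5.4, 6.9, 7.8, 8.7]
COURTIS = [7.4, 8.6, 9.8, 10.9, 11.6]
PROVISIONAL = [6, 7, 8.5, 9.5, 10.5]


def midpoint(march, courtis):
    """Return the score halfway between the March and Courtis medians."""
    return round((march + courtis) / 2, 2)


midpoints = [midpoint(m, c) for m, c in zip(MARCH, COURTIS)]

for grade, mid, goal in zip(GRADES, midpoints, PROVISIONAL):
    print(f"Grade {grade}: midpoint {mid}, provisional standard {goal}")
```

In no grade does the provisional standard for addition differ from the computed midpoint by as much as half an example.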

As was pointed out in the above discussion both individual and 
class scores showed particular successes and failures. These 
were symptoms which required analysis, diagnosis and specific 
treatment. Ideally, of course, each individual should have in- 
struction peculiar to his special need but with the pressure of 
many other kinds of work this is a practical impossibility in 
classes of forty to fifty pupils unless one of the few good devices 
for individual practice is used. These were available to only a 
small percentage of the classes under consideration. 

Hence after a study of results in relation to class room pro- 
cedure and methods of drill, the committee in charge of the 
work attempted to standardize the conditions. Each class was 
divided, according to achievement in each of the four funda- 
mentals, into two groups. The smaller of these groups was 
made up of pupils who did notably poor work in the operation 
concerned. An attempt was made to keep this group small 
enough to permit of individual attention on the part of the 
teacher. The larger and better group had its drill as a class 
exercise. A time limit of eight minutes per day was set for 
drill with each group so that each teacher spent sixteen minutes 
per day in this work and each pupil spent eight minutes. Be- 
cause of the change in the form of division to that used in the 
Courtis tests, two days, Monday and Friday, were given over 
to division. The three remaining operations were studied on 
Tuesday, Wednesday and Thursday respectively. In regard to 
method, the attention of teachers was called to the difficulty en- 
countered in bridging the tens and to the time-wasting practice 
of saying, either audibly or inaudibly, the numbers of a given 
combination instead of automatically producing the answer. 
This practice definitely allied itself with silent reading. Zero 
difficulties, trial divisor difficulties, and others were analyzed, 
and suggestions as to specific methods and devices were 
presented. 

To impress teachers further with the futility of uniform class 
drill, diagrams were prepared showing the wide range of varia- 
tion in the attainments of the pupils of each grade in each opera- 
tion. The diagram for only one of the operations (addition) 
is shown here in Graph III. The portion outlined in solid line
represents the spread of achievements in March and may be
read as follows: 1 per cent of eighth-grade pupils attempted 2
additions; 2 per cent attempted 3; 3 per cent attempted 4; 10
per cent attempted 5 examples and so on to 1 per cent who
attempted 19 examples. The full vertical lines represent March
medians, and it will be seen that there is a regular progress from
grade to grade, but the total progress from grades 4 to 8 is
small compared with the great range of ability within a single
grade. The exceptional cases are far too numerous to be
neglected. Note, for example, that about 3 per cent of fourth-grade
pupils attempt more examples than do half of the eighth-grade
pupils, and that about 3 per cent of eighth-grade pupils
attempt fewer examples than do half of the fourth-grade pupils.
The amount of individual variation or spread of achievement
increases as we go up through the grades, indicating that class
methods are adapted to the average or brighter pupils at the
expense of those at the lower end of the range.
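
The kind of overlap described above can be computed directly from a frequency distribution of attempts. The sketch below is only an illustration: the two distributions are hypothetical and are not the Philadelphia data, the function names are assumptions, and the median is the plain middle score rather than the interpolated median used in scoring the Courtis tests.

```python
def median_from_counts(counts):
    """Median of scores given as {score: number of pupils}."""
    scores = sorted(s for s, n in counts.items() for _ in range(n))
    mid = len(scores) // 2
    if len(scores) % 2:
        return scores[mid]
    return (scores[mid - 1] + scores[mid]) / 2


def share_above(counts, threshold):
    """Per cent of pupils whose score exceeds `threshold`."""
    total = sum(counts.values())
    above = sum(n for s, n in counts.items() if s > threshold)
    return 100 * above / total


# Hypothetical distributions of examples attempted (NOT the survey data).
grade_4 = {2: 10, 3: 20, 4: 30, 5: 20, 6: 10, 7: 5, 8: 2, 9: 2, 10: 1}
grade_8 = {4: 2, 5: 5, 6: 10, 7: 15, 8: 20, 9: 20, 10: 15, 11: 8, 12: 5}

eighth_median = median_from_counts(grade_8)
print("Eighth-grade median:", eighth_median)
print("Per cent of fourth grade above it:", share_above(grade_4, eighth_median))
```

With these invented figures the eighth-grade median is 8 examples, and 3 per cent of the fourth-grade pupils exceed it.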

At the right of this chart is indicated the spread of accuracy
scores. The ideal achievement, of course, is 100 per cent accuracy.
It will be interesting to note that grades 4, 5 and 6
show a somewhat higher per cent of perfect accuracy than do
grades 7 and 8, and in no case except that of division is there
a constant increase in the number of perfect scores through the
grades. There is, however, in every case a constant decrease
through the grades in the per cent of scores showing less than
50 per cent accuracy, and a general tendency is evident for the
per cent of higher accuracies to increase with advancing grades.
The small number of 90 per cent accuracies is explained by the
fact that a pupil must attempt 10 examples before this score
becomes possible.
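
The remark about 90 per cent accuracy follows from the way the score is formed: accuracy is the number of examples right divided by the number attempted, so only certain per cents can occur for a given number of attempts. The sketch below lists the attainable scores and finds the smallest number of attempts at which a given per cent can occur. It assumes that accuracy is recorded exactly and not rounded; the function names are illustrative.

```python
from fractions import Fraction


def possible_accuracies(attempts):
    """All exact per cent accuracies obtainable with `attempts` examples."""
    return [Fraction(100 * right, attempts) for right in range(attempts + 1)]


def min_attempts_for(per_cent, limit=30):
    """Smallest number of attempts at which `per_cent` can occur exactly."""
    for attempts in range(1, limit + 1):
        if Fraction(per_cent) in possible_accuracies(attempts):
            return attempts
    return None


print("Attempts needed for 90 per cent:", min_attempts_for(90))
print("Scores possible with 8 attempts:",
      [round(float(a), 1) for a in possible_accuracies(8)])
```

A pupil who attempts eight examples can score 87.5 or 100 per cent but nothing between, which is why scores of exactly 90 per cent are scarce among pupils working at the lower rates.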

After less than three months of special drill in the manner and 
under the time limits described, Form 2 of the same series of 
tests was administered. The June tests were given under more 
favorable conditions generally than those which prevailed in
March. The weather was favorable, the pupils had been pre- 
paring for just such a test, they were accustomed to its arrange- 
ment and conditions and the form of the division operation was 
now familiar. Accordingly a marked improvement was anti- 
cipated. The results of this June test are given in Table I as 
June Medians and are indicated graphically by the heavy full 
line in Graphs I and II. 

Examination of Graph I will show that the improvement in 
median rate of work for June over that of March was roughly 
two examples in addition, one and one half examples in subtraction,
two examples in multiplication and 1½ to 3½ examples
in division. It is interesting to note in multiplication and especially
in division the gradually increasing improvement in rate
as we ascend through the grades. Addition and subtraction 
rates fall short of the Courtis Medians by about ½ example;
multiplication and division rates equal or slightly exceed the 
Courtis medians. In every case the June medians exceed the 
Provisional June standards by from ¼ to 1 example.
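
The gains quoted in this paragraph can be recomputed from the median rates. The sketch below uses the Philadelphia March and June rate medians as read from Table I and subtracts one from the other, grade by grade. The function name and the layout of the data are illustrative assumptions.

```python
# Median rates (examples attempted) for grades IV to VIII, from Table I.
MARCH = {
    "addition":       [4.4, 5.4, 6.9, 7.8, 8.7],
    "subtraction":    [5.2, 6.5, 7.8, 9.4, 10.6],
    "multiplication": [4.4, 6.3, 7.6, 8.6, 9.7],
    "division":       [3.0, 4.2, 5.2, 6.8, 8.1],
}
JUNE = {
    "addition":       [5.8, 7.8, 8.9, 10.3, 11.0],
    "subtraction":    [6.8, 8.2, 9.5, 11.4, 12.1],
    "multiplication": [6.0, 7.6, 9.5, 10.6, 12.0],
    "division":       [4.3, 6.0, 8.3, 10.3, 11.6],
}


def gains(march, june):
    """Grade-by-grade gain in median rate, rounded to one decimal."""
    return [round(j - m, 1) for m, j in zip(march, june)]


for operation in MARCH:
    g = gains(MARCH[operation], JUNE[operation])
    print(f"{operation:15s} gains by grade: {g}, mean {sum(g) / len(g):.1f}")
```

The division gains rise from 1.3 examples in grade IV to 3.5 in grades VII and VIII, which is the increasing improvement through the grades noted above.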

The definite aim of the special drill on the fundamentals was 
to increase rate of work without diminishing accuracy. Pro- 
visional June standards for accuracy were therefore not ad- 
vanced materially over March achievements. Examination of 
Graph II shows that the drill nevertheless did increase accuracy
even at the higher rate of work. June accuracy medians equal 
or exceed those of Courtis. It appears that increased rate 
means more complete automatization and therefore produces 
greater accuracy. 

A further purpose of the special daily eight minute drill on 
fundamentals was to reduce the great range of variation in 
achievements within a given grade. It was for this reason that 
classes were divided into groups according to ability and special 
individual attention given to the weaker group. An indication 
of the success or failure of this procedure may be studied in
Graph III where the total spread of achievements in addition is 
diagramed for both March and June. It will be noted that 
June scores (in broken lines) show a wider range of variation 
in attempts in every grade than do the scores of March. Not- 
withstanding the fact that the pupils who did the better work 
in March were given only class drill, it is they who seem to 
have profited most by the exercise. Here, it seems, is sufficient 
warrant for eliminating a large part of the drill with pupils who 
have reached or exceeded the Courtis medians. These pupils 
should be free to engage in other work more appropriate to their 
immediate needs. While there is a noticeable reduction in the 
number of pupils who continue to make poor scores in June, 
practically all of the poorer grades of attainment have some 
representatives. These poorer pupils are the "problems" of
the class room. Even individual treatment has failed to reach 
them in three months of drill. Some may lack ability, but most 
of them gradually will achieve success as the result of more 
persistent and continued effort. In accuracy, the June scores 
show a more even spread and there is less variability than in
the March scores. 

If we take the quartile deviations for these addition scores we 
have a crude though satisfactory measure of their comparative 
variation. Table II shows that June variations are larger in 
rate and smaller in accuracy (except eighth-grade accuracy) 

TABLE II.
Quartile Deviations (Q) in Addition Scores for March and June.

             Rate.               Accuracy.
Grade.       March.    June.     March.    June.

IV           1.2       1.5       25.0      20.7
V            1.3       1.5       27.0      16.5
VI           1.5       2.0       17.3      15.0
VII          1.4       2.2       16.2      15.3
VIII         2.0       2.6       13.7      13.9

than the corresponding March variations, which fact would seem 
to indicate that pupils who fail to make gains in rate tend to 
compensate by increased accuracy. This assumption will be 
borne out by inspection of Graph III where the lower per cents 
of accuracy are shown to be materially reduced. It is interest- 
ing to note further in Table II that quartile deviations in rate 
increase through the grades while deviations in accuracy de- 
crease. Pupils of the lower grades have a rather uniform rate 
of work and show very diverse accuracy scores. As practice 
continues, the rate of work becomes more diverse and accuracy 
tends to become uniform. This condition would indicate that 
in the lower grades accuracy should be emphasized even at the 
expense of rate while in the higher grades more and more 
emphasis should be placed upon standard rates of work. 
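
The quartile deviation used here is half the distance between the first and third quartiles of a set of scores. The sketch below shows one way of computing it and then checks the comparisons drawn from Table II. The quartiles are found with the `statistics.quantiles` function of the Python standard library, whose method of interpolation may differ slightly from that used in the original tabulation; the function name `quartile_deviation` is an illustrative assumption.

```python
import statistics


def quartile_deviation(scores):
    """Half the distance between the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return (q3 - q1) / 2


# Quartile deviations for addition, grades IV to VIII, from Table II.
RATE_MARCH = [1.2, 1.3, 1.5, 1.4, 2.0]
RATE_JUNE = [1.5, 1.5, 2.0, 2.2, 2.6]
ACC_MARCH = [25.0, 27.0, 17.3, 16.2, 13.7]
ACC_JUNE = [20.7, 16.5, 15.0, 15.3, 13.9]

# True where the June spread in rate is wider than the March spread.
rate_wider = [j > m for m, j in zip(RATE_MARCH, RATE_JUNE)]
# True where the June spread in accuracy is narrower than the March spread.
acc_narrower = [j < m for m, j in zip(ACC_MARCH, ACC_JUNE)]

print("Rate spread wider in June:", rate_wider)
print("Accuracy spread narrower in June:", acc_narrower)
print("Q for the scores 1 to 7:", quartile_deviation([1, 2, 3, 4, 5, 6, 7]))
```

The June spread in rate is wider in every grade, and the June spread in accuracy is narrower in every grade but the eighth, as stated in the text.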

Stanton School, 

Philadelphia, Pa. 



